Quality evaluation of value sets from cancer study common data elements using the UMLS semantic groups

Authors

  • Guoqian Jiang
  • Harold R. Solbrig
  • Christopher G. Chute
Abstract

OBJECTIVE To develop an approach for evaluating the quality of terminological annotations on the value set (i.e., enumerated value domain) components of common data elements (CDEs) in clinical research, using both Unified Medical Language System (UMLS) semantic types and semantic groups.

MATERIALS AND METHODS The CDEs of the National Cancer Institute (NCI) Cancer Data Standards Repository, the NCI Thesaurus (NCIt) concepts, and the UMLS semantic network were integrated using a semantic web-based framework for a SPARQL-enabled evaluation. First, the set of CDE permissible values with corresponding meanings in external controlled terminologies was isolated. The corresponding value meanings were then evaluated against their NCI- or UMLS-generated semantic network mapping to determine whether all of the meanings fell within the same semantic group.

RESULTS Of the enumerated CDEs in the Cancer Data Standards Repository, 3093 (26.2%) had elements drawn from more than one UMLS semantic group. A random sample (n=100) of this set of elements indicated that 17% of them were likely to have been misclassified.

DISCUSSION Existing semantic web tools can support a high-throughput mechanism for evaluating the quality of large CDE collections. This study demonstrates that the involvement of multiple semantic groups in an enumerated value domain of a CDE is an effective trigger for a quality audit.

CONCLUSION This approach produces a useful quality assurance mechanism for a clinical study CDE repository.
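The core check described in the methods (do all value meanings of a CDE fall within one semantic group?) can be illustrated with a minimal sketch. This is plain Python rather than the SPARQL-based framework the study used, and the CDE identifiers, value meanings, and the helper names `semantic_groups` and `flag_for_audit` are hypothetical. Only the small excerpt of the semantic type to semantic group assignment reflects the actual UMLS semantic groups.

```python
# Excerpt of the UMLS semantic type -> semantic group assignment.
SEMANTIC_GROUP_OF_TYPE = {
    "Neoplastic Process": "Disorders",
    "Disease or Syndrome": "Disorders",
    "Finding": "Disorders",
    "Body Part, Organ, or Organ Component": "Anatomy",
}

# Hypothetical CDEs: each maps to (value meaning, semantic type) pairs.
# These identifiers and annotations are illustrative, not repository data.
CDE_VALUE_MEANINGS = {
    "cde_uniform": [
        ("Adenocarcinoma", "Neoplastic Process"),
        ("Sarcoma", "Neoplastic Process"),
    ],
    "cde_mixed": [
        ("Adenocarcinoma", "Neoplastic Process"),
        ("Liver", "Body Part, Organ, or Organ Component"),
    ],
}


def semantic_groups(value_meanings):
    """Return the set of semantic groups covered by a CDE's value meanings."""
    return {SEMANTIC_GROUP_OF_TYPE[sem_type] for _, sem_type in value_meanings}


def flag_for_audit(cdes):
    """Return CDEs whose value meanings span more than one semantic group."""
    flagged = {}
    for cde_id, meanings in cdes.items():
        groups = semantic_groups(meanings)
        if len(groups) > 1:
            flagged[cde_id] = sorted(groups)
    return flagged


audit_candidates = flag_for_audit(CDE_VALUE_MEANINGS)
print(audit_candidates)
```

A flagged CDE is only a candidate for review: in the study's random sample, 17% of the multi-group CDEs were judged likely to be misclassified, so the check narrows the audit rather than deciding it.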

Similar resources

Development of Data Models for Nursing Assessment of Cancer Survivors Using Concept Analysis

OBJECTIVES Sharing of cancer-related information among healthcare professionals is crucial to ensuring the quality of long-term care for cancer survivors. Appropriate distribution of the essential facts can be achieved using data models. The purpose of this study was to develop and validate suitable data models for use in the nursing assessment of cancer survivors. METHODS The models develope...

Automated Tools for Clinical Research Data Quality Control using NCI Common Data Elements

Clinical research data generated by a federation of collection mechanisms and systems are often highly dissimilar and of varying quality. Poor data quality can result in the inefficient use of research data or can even require the repetition of the performed studies, a costly process. This work presents two tools for improving data quality of clinical research data relying on the Nation...

A mapping scheme for Traditional Iranian Medicine concepts in the structure of the Metathesaurus and Semantic Network of the Unified Medical Language System (UMLS)

Introduction: This research aimed to analyze the mapping scheme of Traditional Iranian Medicine (TIM) onto the structure of the Metathesaurus and Semantic Network of the Unified Medical Language System (UMLS). The domain, location, and relations of TIM within the UMLS are laid out, and the location and proportion of TIM concepts are reported. Methods: This is a triphasic research...

Study of Spatial Data Quality Elements and VGI Linear Data Quality Assessment Methods

Volunteered Geographic Information (VGI) has provided a rich and valuable resource for spatial data in a variety of applications. Despite its many benefits, this information comes with no guarantee of quality. Several methods have been proposed so far to determine the quality of VGI. In addition to introducing quality elements and their evaluation methods, the present study attempts to explore ...

Development and Validation of the Coping Model based on the Semantic-value Components of Language in Relapse Prevention among Substance Abusers

Objective: This research sought to develop and validate a coping model based on semantic-value components of language in relapse prevention of substance abuse in Isfahan in 2016. Method: A mixed research method was used in this regard. The sample units in the qualitative section included a number of substance abusers that were selected via purposive sampling method. The sample size in this sect...

Volume 19

Publication date: 2012